High-Performance Machine Learning for Large-Scale Data Classification considering Class Imbalance
Authors
Abstract
Similar resources
Large-Scale Machine Learning for Classification and Search
Large-scale machine learning for metagenomics sequence classification
MOTIVATION Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can op...
Large Scale Machine Learning
This thesis deals broadly with learning algorithms, with particular interest in large databases. After formulating the learning problem mathematically, we present several important learning algorithms, in particular Multi Layer Perceptrons, Mixtures of Experts, and Support Vector Machines. We consid...
hi-RF: Incremental Learning Random Forest for large-scale multi-class Data Classification
In recent years, dynamically growing data and an incrementally growing number of classes have posed new challenges for large-scale data classification research. Most traditional methods struggle to balance precision and computational burden as the data and the number of classes increase: some methods have weak precision, and others are time-consuming. In this paper, we propose an incr...
Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing
Large-scale classification is an increasingly critical Big Data problem. So far, however, very little has been published on how this is done in practice. In this paper we describe Chimera, our solution to classify tens of millions of products into 5000+ product types at WalmartLabs. We show that at this scale, many conventional assumptions regarding learning and crowdsourcing break down, and th...
Journal
Journal title: Scientific Programming
Year: 2020
ISSN: 1058-9244,1875-919X
DOI: 10.1155/2020/1953461